St. Louis County
- Oceania > Australia > South Australia > Adelaide (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- (2 more...)
- Overview (0.68)
- Research Report (0.46)
AI Deepfakes Are Impersonating Pastors to Try to Scam Their Congregations
Religious communities around the US are being hit with AI-generated depictions of their leaders sharing incendiary sermons and asking for donations. Father Mike Schmitz, a Catholic priest and podcaster, addressed his congregation of more than 1.2 million YouTube subscribers in November with an unusual kind of homily. You couldn't always trust the words coming out of his mouth, Schmitz said, because sometimes they weren't really his words--or his mouth. Schmitz had become the target of AI-generated impersonation scams. "You're being watched by a demonic human," said the fake Schmitz in one video that the real Schmitz, wearing an L.L. Bean jacket over his clerical suit, included in his public service announcement as an example.
- North America > United States > California (0.15)
- Asia > China (0.05)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- (9 more...)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (1.00)
- Media (0.95)
From Memorization to Reasoning in the Spectrum of Loss Curvature
Merullo, Jack, Vatsavaya, Srihita, Bushnaq, Lucius, Lewis, Owen
We characterize how memorization is represented in transformer models and show that it can be disentangled in the weights of both language models (LMs) and vision transformers (ViTs) using a decomposition based on loss landscape curvature. This insight builds on prior theoretical and empirical work showing that the curvature for memorized training points is much sharper than for non-memorized ones, meaning that ordering weight components from high to low curvature can reveal the distinction without explicit labels. This motivates a weight editing procedure that suppresses recitation of untargeted memorized data more effectively than a recent unlearning method (BalancedSubnet), while maintaining lower perplexity. Since the curvature basis has a natural interpretation in terms of shared structure in model weights, we extensively analyze the effect of the editing procedure on downstream tasks in LMs, and find that fact retrieval and arithmetic are specifically and consistently degraded, even though open-book fact retrieval and general logical reasoning are preserved. We posit that these tasks rely heavily on specialized directions in weight space rather than general-purpose mechanisms, regardless of whether the individual datapoints are memorized. We support this by showing a correspondence between the activation strength of task data on the low-curvature components we edit out and the drop in task performance after the edit. Our work enhances the understanding of memorization in neural networks, with practical applications toward removing it, and provides evidence for idiosyncratic, narrowly used structures involved in solving tasks like math and fact retrieval.
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- (3 more...)
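The curvature-based ordering described in the abstract above can be illustrated with a toy sketch: estimate per-direction loss curvature by finite differences, rank weight components by it, and zero out one end of the ranking. The diagonal quadratic model, the finite-difference estimator, and the choice to zero the sharpest directions are illustrative assumptions, not the paper's actual decomposition or editing procedure (which end of the curvature spectrum to remove is the paper's empirical question).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: each weight direction has a known loss curvature; a few
# "sharp" directions stand in for memorization, the flat ones for shared
# structure. These curvature values are made up for illustration.
d = 8
curvatures = np.array([50.0, 40.0, 1.0, 0.8, 0.5, 0.4, 0.2, 0.1])
w = rng.normal(size=d)

def loss(w):
    # Sum of independent per-direction quadratics.
    return 0.5 * np.sum(curvatures * w ** 2)

# Estimate per-direction curvature with a central second difference.
eps = 1e-3
est = np.empty(d)
for i in range(d):
    e = np.zeros(d)
    e[i] = eps
    est[i] = (loss(w + e) - 2 * loss(w) + loss(w - e)) / eps ** 2

# Edit: zero the k components at the sharp end of the curvature ranking.
k = 2
sharpest = np.argsort(est)[-k:]
w_edited = w.copy()
w_edited[sharpest] = 0.0

print(np.round(est, 2))           # recovers the per-direction curvatures
print(sorted(sharpest.tolist()))  # indices of the sharpest directions
```

For a quadratic loss the second difference is exact up to floating-point error, so the ranking matches the true curvatures; real models need Hessian-vector products or similar machinery instead.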
Overspecified Mixture Discriminant Analysis: Exponential Convergence, Statistical Guarantees, and Remote Sensing Applications
Bolatov, Arman, Legg, Alan, Melnykov, Igor, Nurlanuly, Amantay, Tezekbayev, Maxat, Assylbekov, Zhenisbek
This study explores the classification error of Mixture Discriminant Analysis (MDA) in scenarios where the number of mixture components exceeds those present in the actual data distribution, a condition known as overspecification. We use a two-component Gaussian mixture model within each class to fit data generated from a single Gaussian, analyzing both the algorithmic convergence of the Expectation-Maximization (EM) algorithm and the statistical classification error. We demonstrate that, with suitable initialization, the EM algorithm converges exponentially fast to the Bayes risk at the population level. Furthermore, we extend our results to finite samples, showing that the classification error converges to the Bayes risk at a rate of $n^{-1/2}$ under mild conditions on the initial parameter estimates and sample size. This work provides a rigorous theoretical framework for understanding the performance of overspecified MDA, which is often used empirically in complex data settings, such as image and text classification. To validate our theory, we conduct experiments on remote sensing datasets.
- Asia > Middle East > Jordan (0.04)
- Oceania > Australia (0.04)
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- (9 more...)
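The overspecified setting analyzed above can be reproduced in a few lines: fit a two-component, unit-variance Gaussian mixture by EM to data drawn from a single Gaussian. The 1-D setup, fixed equal mixing weights, and known variances are simplifying assumptions for illustration; the paper analyzes the full MDA classifier and its finite-sample rates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Data from a single Gaussian N(mu_true, 1): the two-component mixture
# below is therefore overspecified.
mu_true = 2.0
x = rng.normal(mu_true, 1.0, size=5000)

# Two-component mixture with unit variances and equal weights; EM fits
# only the two component means, initialized to bracket the truth
# (a "suitable initialization" in the sense of the abstract).
mu = np.array([1.0, 3.0])
for _ in range(200):
    # E-step: responsibilities under unit-variance Gaussians.
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted component means.
    mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)

# The fitted mixture mean tracks the true mean, so the plug-in
# classifier stays close to the Bayes rule despite overspecification.
mix_mean = mu.mean()
print(np.round(mix_mean, 2))
```

The interesting behavior the paper quantifies is how fast this happens and what it costs in classification error; this sketch only shows that the overspecified fit remains well-behaved.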
- Oceania > Australia > South Australia > Adelaide (0.04)
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- North America > United States > Minnesota > Saint Louis County > Duluth (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Overview (0.68)
- Research Report (0.46)
Safety Pretraining: Toward the Next Generation of Safe AI
Maini, Pratyush, Goyal, Sachin, Sam, Dylan, Robey, Alex, Savani, Yash, Jiang, Yiding, Zou, Andy, Fredrikson, Matt, Lipton, Zachary C., Kolter, J. Zico
As large language models (LLMs) are increasingly deployed in high-stakes settings, the risk of generating harmful or toxic content remains a central challenge. Post-hoc alignment methods are brittle: once unsafe patterns are learned during pretraining, they are hard to remove. In this work, we present a data-centric pretraining framework that builds safety into the model from the start. Our framework consists of four key steps: (i) Safety Filtering: building a safety classifier to sort web data into safe and unsafe categories; (ii) Safety Rephrasing: recontextualizing unsafe web data into safer narratives; (iii) Native Refusal: developing the RefuseWeb and Moral Education pretraining datasets, which actively teach the model to refuse unsafe content and convey the moral reasoning behind refusal; and (iv) Harmfulness-Tag annotated pretraining: flagging unsafe content during pretraining with a special token and using it to steer the model away from unsafe generations at inference. Our safety-pretrained models reduce attack success rates from 38.8% to 8.4% on standard LLM safety benchmarks with no performance degradation on general tasks.
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- North America > United States > Minnesota > Saint Louis County > Duluth (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Instructional Material (0.92)
- Research Report > New Finding (0.46)
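Step (iv) of the framework above, harmfulness-tag annotated pretraining, amounts to a data-preparation pass: documents a safety classifier flags are prefixed with a special token so the model can condition on it. The sketch below uses a hypothetical token name and a trivial keyword stand-in for the learned classifier; neither is the paper's actual artifact.

```python
# Hypothetical special-token name, not the paper's actual token.
HARM_TOKEN = "<|harmful|>"

def toy_safety_classifier(text: str) -> bool:
    # Stand-in for the learned safety classifier of step (i):
    # a crude keyword check, purely for illustration.
    unsafe_markers = {"exploit", "weapon"}
    return any(m in text.lower() for m in unsafe_markers)

def tag_for_pretraining(docs):
    # Prefix flagged documents with the harmfulness tag; leave the
    # rest untouched. The tagged corpus is what pretraining consumes.
    out = []
    for d in docs:
        out.append(HARM_TOKEN + " " + d if toy_safety_classifier(d) else d)
    return out

docs = ["How to bake bread", "How to build a weapon"]
print(tag_for_pretraining(docs))
# -> ['How to bake bread', '<|harmful|> How to build a weapon']
```

At inference, generation is steered away from continuations that the model associates with the tag; that decoding-side logic is not shown here.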
Geometric Learning Dynamics
We present a unified geometric framework for modeling learning dynamics in physical, biological, and machine learning systems. The theory reveals three fundamental regimes, each emerging from the power-law relationship $g \propto \kappa^\alpha$ between the metric tensor $g$ in the space of trainable variables and the noise covariance matrix $\kappa$. The quantum regime corresponds to $\alpha = 1$ and describes Schrödinger-like dynamics that emerges from a discrete shift symmetry. The efficient learning regime corresponds to $\alpha = \tfrac{1}{2}$ and describes very fast machine learning algorithms. The equilibration regime corresponds to $\alpha = 0$ and describes classical models of biological evolution. We argue that the emergence of the intermediate regime $\alpha = \tfrac{1}{2}$ is a key mechanism underlying the emergence of biological complexity.
- North America > United States > New York (0.04)
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- North America > United States > Minnesota > Saint Louis County > Duluth (0.04)
- (2 more...)
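The three regimes named in the abstract above can be collected into a single displayed relation, restating the abstract in its own notation:

```latex
g \propto \kappa^{\alpha}, \qquad
\begin{cases}
\alpha = 1, & \text{quantum regime: Schr\"odinger-like dynamics,}\\
\alpha = \tfrac{1}{2}, & \text{efficient learning regime: fast learning algorithms,}\\
\alpha = 0, & \text{equilibration regime: classical biological evolution.}
\end{cases}
```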
Duluth at SemEval-2025 Task 7: TF-IDF with Optimized Vector Dimensions for Multilingual Fact-Checked Claim Retrieval
Syed, Shujauddin, Pedersen, Ted
This paper presents the Duluth approach to SemEval-2025 Task 7 on Multilingual and Crosslingual Fact-Checked Claim Retrieval. We implemented a TF-IDF-based retrieval system and experimented with vector dimensions and tokenization strategies. Our best-performing configuration used word-level tokenization with a vocabulary size of 15,000 features, achieving an average success@10 score of 0.78 on the development set and 0.69 on the test set across ten languages. Our system showed stronger performance on higher-resource languages but still lagged significantly behind the top-ranked system, which achieved an average success@10 of 0.96. Our findings suggest that although advanced neural architectures increasingly dominate multilingual retrieval tasks, properly optimized traditional methods like TF-IDF remain competitive baselines, especially in compute-limited scenarios.
- North America > United States > Minnesota > St. Louis County > Duluth (0.14)
- North America > United States > Minnesota > Saint Louis County > Duluth (0.14)
- Europe > Austria > Vienna (0.14)
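The pipeline described above (word-level TF-IDF with a capped vocabulary, ranked by cosine similarity, scored with success@k) can be sketched end-to-end in plain Python. The toy corpus, the whitespace tokenizer, and the IDF formula are illustrative assumptions, not the Duluth system's exact configuration.

```python
import math
from collections import Counter

def tokenize(text):
    # Word-level tokenization, as in the described best configuration.
    return text.lower().split()

def build_vocab(docs, max_features=15000):
    # Keep the most frequent tokens up to the vocabulary cap.
    counts = Counter(t for d in docs for t in tokenize(d))
    return {w: i for i, (w, _) in enumerate(counts.most_common(max_features))}

def tfidf_vector(text, vocab, idf):
    # Sparse L2-normalized TF-IDF vector as a {feature: weight} dict.
    tf = Counter(t for t in tokenize(text) if t in vocab)
    vec = {vocab[w]: c * idf[w] for w, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {i: v / norm for i, v in vec.items()}

def cosine(a, b):
    return sum(v * b.get(i, 0.0) for i, v in a.items())

# Toy collection of fact-checked claims to retrieve against.
fact_checks = ["the earth is round",
               "vaccines do not cause autism",
               "the moon landing happened in 1969"]
vocab = build_vocab(fact_checks)
n = len(fact_checks)
df = Counter(w for d in fact_checks for w in set(tokenize(d)))
idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
index = [tfidf_vector(d, vocab, idf) for d in fact_checks]

def retrieve(claim, k=10):
    q = tfidf_vector(claim, vocab, idf)
    ranked = sorted(range(n), key=lambda i: cosine(q, index[i]), reverse=True)
    return ranked[:k]

def success_at_k(claim, gold_idx, k=10):
    # 1.0 if the gold fact-check appears in the top k, else 0.0.
    return 1.0 if gold_idx in retrieve(claim, k) else 0.0

print(retrieve("did the moon landing really happen", k=1))  # -> [2]
```

Averaging `success_at_k` over a labeled development set of claim/fact-check pairs yields the success@10 figures quoted above.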
Molecular Learning Dynamics
Gusev, Yaroslav, Vanchurin, Vitaly
We apply the physics-learning duality to molecular systems by complementing the physical description of interacting particles with a dual learning description, where each particle is modeled as an agent minimizing a loss function. In the traditional physics framework, the equations of motion are derived from the Lagrangian function, while in the learning framework, the same equations emerge from learning dynamics driven by the agent loss function. The loss function depends on scalar quantities that describe invariant properties of all other agents or particles. To demonstrate this approach, we first infer the loss functions of oxygen and hydrogen directly from a dataset generated by the CP2K physics-based simulation of water molecules. We then employ the loss functions to develop a learning-based simulation of water molecules, which achieves comparable accuracy while being significantly more computationally efficient than standard physics-based simulations.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- North America > United States > Minnesota > Saint Louis County > Duluth (0.04)
- (4 more...)
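The learning-side description above can be sketched with a toy system: each particle is an agent performing gradient descent on its own loss, and the loss depends only on invariants (here, pairwise distances). The harmonic-pair loss and 2-D setup are illustrative assumptions, not the loss functions the authors infer from CP2K water data.

```python
import numpy as np

def agent_loss(pos, i, r0=1.0, k=1.0):
    # Loss for agent i: harmonic penalty on its distances to all other
    # particles, with rest length r0. Distances are the invariants.
    diffs = np.delete(pos, i, axis=0) - pos[i]
    dists = np.linalg.norm(diffs, axis=1)
    return 0.5 * k * np.sum((dists - r0) ** 2)

def step(pos, lr=0.05, eps=1e-6):
    # One learning step: every agent lowers its own loss by numerical
    # gradient descent on its own coordinates.
    new = pos.copy()
    for i in range(len(pos)):
        grad = np.zeros(pos.shape[1])
        for d in range(pos.shape[1]):
            e = np.zeros_like(pos)
            e[i, d] = eps
            grad[d] = (agent_loss(pos + e, i) - agent_loss(pos - e, i)) / (2 * eps)
        new[i] = pos[i] - lr * grad
    return new

# Three particles starting closer than the rest length.
pos = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.4]])
for _ in range(300):
    pos = step(pos)

# The learning dynamics relaxes all pair distances toward r0 = 1.0,
# i.e. the same equilibrium a physical force description would give.
print(round(float(np.linalg.norm(pos[0] - pos[1])), 2))
```

The duality claim is that such loss-driven updates reproduce the equations of motion; here the equilibrium (an equilateral triangle of side r0) matches what the equivalent pairwise force would produce.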
Covariant Gradient Descent
Guskov, Dmitry, Vanchurin, Vitaly
We present a manifestly covariant formulation of the gradient descent method, ensuring consistency across arbitrary coordinate systems and general curved trainable spaces. The optimization dynamics is defined using a covariant force vector and a covariant metric tensor, both computed from the first and second statistical moments of the gradients. These moments are estimated through time-averaging with an exponential weight function, which preserves linear computational complexity. We show that commonly used optimization methods such as RMSProp, Adam and AdaBelief correspond to special limits of the covariant gradient descent (CGD) and demonstrate how these methods can be further generalized and improved.
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- North America > United States > Minnesota > Saint Louis County > Duluth (0.04)
- North America > United States > Florida > Broward County > Weston (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.58)
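The reduction of covariant gradient descent to familiar optimizers can be sketched in the diagonal limit: estimate the force from the first moment of the gradients and a diagonal metric from the second moment, both via exponential time-averaging, then take the metric-preconditioned step. With these choices the update coincides with the well-known Adam rule; the test function and hyperparameters below are illustrative assumptions, and the paper's general formulation allows a full (non-diagonal) metric tensor.

```python
import numpy as np

def cgd_minimize(grad_fn, w0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=500):
    w = np.asarray(w0, dtype=float)
    m = np.zeros_like(w)   # first moment: covariant force estimate
    v = np.zeros_like(w)   # second moment: diagonal metric estimate
    for t in range(1, steps + 1):
        g = grad_fn(w)
        # Exponential weight function keeps the moment estimates
        # linear-time in the number of parameters.
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)   # bias correction
        v_hat = v / (1 - b2 ** t)
        # Metric-preconditioned step: in this diagonal limit,
        # exactly the Adam update.
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Minimize the anisotropic quadratic f(w) = 0.5 * (100 w0^2 + w1^2):
# the metric rescales the two badly conditioned directions.
grad = lambda w: np.array([100.0 * w[0], w[1]])
w_star = cgd_minimize(grad, [1.0, 1.0])
print(np.round(w_star, 3))
```

Dropping the first-moment averaging (b1 = 0) recovers an RMSProp-like limit, matching the claim that these optimizers are special cases of the covariant scheme.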